Regularization in higher dimensions by YuanbinLiu · Pull Request #354 · autoatml/autoplex

YuanbinLiu · 2025-03-07T17:46:18Z

We have upgraded the functionality to support high-dimensional convex hull calculations with regularization. Previously, it was limited to 3D convex hulls, but we are now able to handle higher-dimensional cases, such as 4D convex hulls for three-element systems.

Test 1: Verify whether the new code can produce the same convex hull results as the old version for the 3D case.
Test 2: Evaluate whether the new code can handle high-dimensional (>3D) convex hulls.

JaGeo · 2025-03-07T18:08:53Z

Thank you @YuanbinLiu ! Some tests are still failing.

Can we handle anything beyond 4D? If not, should we add this limitation to the documentation?

YuanbinLiu · 2025-03-07T18:29:39Z

> Thank you @YuanbinLiu ! Some tests are still failing.
>
> Can we handle anything beyond 4D? If not, should we add this limitation to the documentation?

It's not limited to 4D; higher dimensions are also feasible. I'll find time to check for errors, but the tests passed before I pushed the PR.

JaGeo · 2025-03-07T18:34:35Z

@YuanbinLiu Thanks for the explanation!

Test results can be dependent on the operating system

YuanbinLiu · 2025-03-07T19:40:31Z

> @YuanbinLiu Thanks for the explanation!
>
> Test results can be dependent on the operating system

I have tried several times, and all tests passed. I didn't find any bugs. Can anyone test it on a different system?

JaGeo · 2025-03-07T19:45:01Z

+            f"Point and points must have the same dimensionality. Got {pn.shape[0]} and {preg.shape[1]}."
+        )
+    hull = ConvexHull(preg)
+    return np.all(np.dot(hull.equations[:, :-1], pn) + hull.equations[:, -1] <= 1e-12)


Is this floating point comparison maybe brittle?

It shouldn't be caused by this. I tested it on different computers with new environments, and it worked fine.
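
To make the tolerance question concrete, here is a minimal sketch of the half-space test from the hunk above with the tolerance exposed as a parameter. It assumes facet equations in the layout of SciPy's `ConvexHull.equations` (one row per facet: outward normal followed by an offset, with `normal @ x + offset <= 0` for points inside). The unit-square facets and the helper name `point_in_hull` are written by hand for illustration and are not part of the PR.

```python
import numpy as np

# Facets of the unit square [0, 1] x [0, 1], written by hand in the
# [normal_x, normal_y, offset] layout: normal @ x + offset <= 0 inside.
SQUARE_EQUATIONS = np.array([
    [-1.0, 0.0, 0.0],   # x >= 0
    [1.0, 0.0, -1.0],   # x <= 1
    [0.0, -1.0, 0.0],   # y >= 0
    [0.0, 1.0, -1.0],   # y <= 1
])


def point_in_hull(equations, point, tol=1e-12):
    """Return True if `point` satisfies every facet inequality within `tol`."""
    point = np.asarray(point, dtype=float)
    if point.shape[0] != equations.shape[1] - 1:
        raise ValueError("Point and facet equations must have the same dimensionality.")
    signed_distances = equations[:, :-1] @ point + equations[:, -1]
    return bool(np.all(signed_distances <= tol))


inside = point_in_hull(SQUARE_EQUATIONS, [0.5, 0.5])
outside = point_in_hull(SQUARE_EQUATIONS, [1.5, 0.5])

# A point about 1e-9 outside the right edge: rejected at 1e-12, accepted at 1e-6.
near_edge = [1.0 + 1e-9, 0.5]
strict = point_in_hull(SQUARE_EQUATIONS, near_edge, tol=1e-12)
loose = point_in_hull(SQUARE_EQUATIONS, near_edge, tol=1e-6)
```

Points that sit on a facet up to round-off are the ones whose classification depends on the chosen tolerance; points well inside or well outside are unaffected.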

JaGeo · 2025-03-07T19:46:07Z

-    fraction_list = [[1.0]] + [[0.0]] + [[0.5]] * 8
+    calc_hull_ND = calculate_hull_nd(label)
+
+    assert np.allclose(calc_hull_3D.equations, calc_hull_ND.equations, atol=1e-6)


Different tolerance needed?
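
As background for the tolerance question, `np.allclose(a, b)` checks `|a - b| <= atol + rtol * |b|` element-wise, with defaults `rtol=1e-5` and `atol=1e-8`. The sketch below uses made-up numbers (not the PR's hull equations) to show that entries near zero are governed almost entirely by `atol`.

```python
import numpy as np

reference = np.array([0.0, 1.0, 250.0])
# The same values with a small deviation, as might appear on another platform.
candidate = reference + 5e-7

# Default tolerances: the entry at 0.0 gets no help from rtol,
# so a 5e-7 deviation there fails the check.
default_ok = bool(np.allclose(candidate, reference))

# An absolute tolerance of 1e-6 covers the deviation on every entry.
atol_ok = bool(np.allclose(candidate, reference, atol=1e-6))
```

If a comparison fails only on some platforms, printing `np.abs(candidate - reference).max()` is a quick way to see how large a tolerance would actually be needed.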

JaGeo · 2025-03-07T19:46:22Z

+        get_e_dist_hull_ND = get_e_distance_to_hull_nd(
+            calc_hull_ND, atom, {3: -0.28649227, 17: -0.25638457}, "REF_energy"
+        )
+        assert np.allclose(get_e_dist_hull_3D, get_e_dist_hull_ND, atol=1e-6)


JaGeo · 2025-03-07T19:50:00Z

@YuanbinLiu could you check all numerical tolerances again and especially for the failing test? It is likely just the tolerance

JaGeo · 2025-03-08T19:54:25Z

    ]

-    assert all(d >= -1e-6 for d in des)
+    assert np.all(np.array(des) >= -1e6)


Shouldn't it be -1e-6?

Oh yes, I missed that minus sign; that seems to be why the test passed. I just saw that both methods return 1e6 if they fail:
https://github.com/YuanbinLiu/autoplex_pub/blob/76c29c6c3787c1cc4cd30ec13c3f42969da6296a/src/autoplex/fitting/common/regularization.py#L653

If some configurations have a distance from the convex hull exceeding 1e6, they will be excluded from the training set afterwards.
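
A small illustration of why the typo let the test pass. The distance values are invented; only the two bounds (`-1e6` and `-1e-6`) and the 1e6 failure value come from the thread. The last check is an assumption about how a test could additionally catch the failure value, not something the PR does.

```python
import numpy as np

# Invented distances to the hull; -0.5 is an obviously bad value.
des = np.array([0.0, 2.3e-3, -0.5])

# Typo: the lower bound is minus one million, so nearly anything passes.
typo_passes = bool(np.all(des >= -1e6))

# Intended: allow only tiny negative round-off.
intended_passes = bool(np.all(des >= -1e-6))

# The 1e6 failure value is non-negative, so it satisfies the intended bound.
failed = np.array([0.0, 1e6])
sentinel_slips_through = bool(np.all(failed >= -1e-6))

# Hypothetical stricter check that also rejects the failure value.
sentinel_caught = bool(np.all((failed >= -1e-6) & (failed < 1e6)))
```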

naik-aakash · 2025-03-08T19:55:43Z

> @YuanbinLiu could you check all numerical tolerances again and especially for the failing test? It is likely just the tolerance

Are the values supposed to be non-zero, positive, or negative? Is there a specific range these values are supposed to be in? Also, with this new function supporting n dimensions, do we still need the 3D method, or can it be replaced with the new function now? Can you comment on this, @YuanbinLiu?

YuanbinLiu · 2025-03-10T17:05:48Z

> @YuanbinLiu could you check all numerical tolerances again and especially for the failing test? It is likely just the tolerance
>
> Are the values supposed to be non-zero, positive, or negative? Is there a specific range these values are supposed to be in? Also, with this new function supporting n dimensions, do we still need the 3D method, or can it be replaced with the new function now? Can you comment on this, @YuanbinLiu?

They should be non-negative (i.e., >= 0). Actually, we no longer need 3D, as this has already been handled in the new function.

QuantumChemist · 2025-03-12T10:48:16Z

    regularization: bool = False,
    retain_existing_sigma: bool = False,
    scheme: str = "linear-hull",
+    element_order: list = None,


Suggested change

element_order: list = None,

element_order: list | None = None,

list = None would cause a data type mismatch for list
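
A minimal sketch of the suggested annotation in isolation. The function name `describe_element_order` is hypothetical; only the `element_order: list | None = None` pattern comes from the review. The `from __future__ import annotations` line is an assumption added so the `X | None` syntax is also accepted in annotations on Python versions before 3.10.

```python
from __future__ import annotations


def describe_element_order(element_order: list[int] | None = None) -> str:
    """Return a short label for the given element order.

    `None` means "not set", which the annotation now states explicitly.
    """
    if element_order is None:
        return "element order not set"
    return "-".join(str(z) for z in element_order)


default_label = describe_element_order()
mos2_label = describe_element_order([42, 16])
```

With the bare `list = None` form, a static type checker would typically flag the default as incompatible with the declared type.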

QuantumChemist · 2025-03-12T10:51:06Z

+                norm = np.cross(n_d[2] - n_d[0], n_d[1] - n_d[0])
+                plane_norm = norm / np.linalg.norm(norm)
+            else:
+                A = n_d[:-1] - n_d[0]


Maybe clearer variable names would be a bit better?

Thanks. Changed.
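
For readers following the hunk above: in 3D the facet normal comes from a cross product, which has no direct counterpart in higher dimensions. The sketch below shows one common way to get a hyperplane normal in any dimension, using descriptive names. It is not the PR's code; the function name and the SVD-based approach are assumptions chosen for illustration.

```python
import numpy as np


def hyperplane_unit_normal(vertices):
    """Unit normal of the hyperplane through n points in n dimensions."""
    vertices = np.asarray(vertices, dtype=float)
    # Edge vectors from the first vertex span the hyperplane: shape (n-1, n).
    edge_vectors = vertices[1:] - vertices[0]
    # The normal is orthogonal to every edge vector. With full_matrices=True
    # (the default), the last row of vh spans that one-dimensional null space.
    _, _, vh = np.linalg.svd(edge_vectors)
    normal = vh[-1]
    return normal / np.linalg.norm(normal)


# 3D: compare with the cross product (the sign of the normal is arbitrary).
triangle = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
normal_3d = hyperplane_unit_normal(triangle)
cross = np.cross(triangle[1] - triangle[0], triangle[2] - triangle[0])

# 4D: the facet through the four unit points has normal (1, 1, 1, 1) / 2.
facet_4d = np.eye(4)
normal_4d = hyperplane_unit_normal(facet_4d)
```

Because the sign returned by the SVD is arbitrary, code that needs an outward or upward normal has to fix the orientation separately.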

naik-aakash · 2025-04-10T17:11:49Z

Hi @YuanbinLiu , seems fine to me now, just wondering if we still need this method now ? https://github.com/YuanbinLiu/autoplex_pub/blob/a8bddf095e72ba300f7046ab347218feae6a5f20/src/autoplex/fitting/common/regularization.py#L529

The newly added one should basically work for any case. Maybe we can remove it, as we will not use it?

Also tagging @JaGeo , if you have any thoughts on it as well.

JaGeo · 2025-04-10T17:20:21Z

If the new method replaces it completely, we should probably remove it as we will otherwise lose track of it

JaGeo · 2025-04-10T20:25:33Z

+    element_order: list | None
+        List of atomic numbers in order of choice (e.g. [42, 16] for MoS2)


could we determine this automatically?

Good question. I would say it's definitely doable, but modifying it involves more code changes. I'll be updating some features of RSS in the next PR, and I'll include this functionality in that update.
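
One possible shape for the automatic determination discussed here, as a sketch only. It assumes each structure is available as a sequence of atomic numbers (with ASE this could come from `atoms.get_atomic_numbers()`) and picks ascending atomic number as the order. The helper name `infer_element_order` is hypothetical, and the ordering convention the PR actually expects may differ.

```python
def infer_element_order(structures):
    """Collect the distinct atomic numbers across all structures, sorted ascending."""
    seen = set()
    for atomic_numbers in structures:
        seen.update(int(z) for z in atomic_numbers)
    return sorted(seen)


# MoS2-like, pure-S and pure-Mo cells given as plain lists of atomic numbers.
structures = [[42, 16, 16], [16, 16], [42]]
element_order = infer_element_order(structures)
```

Note that this yields `[16, 42]`, not the `[42, 16]` from the docstring example, so any automatic scheme would need to settle on one convention and apply it consistently.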

JaGeo · 2025-04-10T20:26:20Z

+                - (plane_constant - np.dot(plane_norm[:-1], sp[:-1])) / plane_norm[-1]
+            )
+
+    print("Failed to find distance to hull in ND")


logging instead?

Ahh, yes, done.

JaGeo · 2025-04-10T20:28:08Z

    scheme: str
        Scheme to use for regularization.
+    element_order:
+        List of atomic numbers in order of choice (e.g. [42, 16] for MoS2)


could we extend the doc string? I am not sure I fully understand what this is for (e.g., the sigma)

This is introduced to support solving high-dimensional energy convex hulls. Sure, I will extend it accordingly.

naik-aakash · 2025-04-10T20:32:01Z

+    energy = (
+        atoms.info[energy_name] / len(atoms)
+        if energy_name != "energy"
+        else atoms.get_potential_energy() / len(atoms)


Why do we need this additional else condition? Wouldn't this fail if no calculator is attached to the Atoms object? Or does it still work fine?

This is because in the new version of ASE, if the energy label is "energy", it no longer supports reading from info, but instead uses get_potential_energy(). However, if the user specifies a different energy label, it will be read from info according to their setting.

Oh okay, thanks. I did not know this had changed.
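
The branching in the hunk above can be sketched as a small helper. To keep the example runnable without ASE, it uses a stand-in class (`FakeAtoms`, not an ASE class) that offers only the three members the hunk relies on: `info`, `len()`, and `get_potential_energy()`. The helper name `energy_per_atom` and the stand-in's error are illustrative assumptions.

```python
def energy_per_atom(atoms, energy_name="energy"):
    """Per-atom energy, read from `info` unless the label is "energy"."""
    if energy_name != "energy":
        return atoms.info[energy_name] / len(atoms)
    return atoms.get_potential_energy() / len(atoms)


class FakeAtoms:
    """Stand-in exposing only the members used above (not an ASE class)."""

    def __init__(self, n_atoms, info, calculator_energy=None):
        self._n_atoms = n_atoms
        self.info = info
        self._calculator_energy = calculator_energy

    def __len__(self):
        return self._n_atoms

    def get_potential_energy(self):
        if self._calculator_energy is None:
            raise RuntimeError("no calculator attached")
        return self._calculator_energy


# A custom label is read from `info`; no calculator is needed.
labelled = FakeAtoms(4, {"REF_energy": -8.0})
from_info = energy_per_atom(labelled, "REF_energy")

# The default label goes through the calculator path.
with_calc = FakeAtoms(2, {}, calculator_energy=-3.0)
from_calc = energy_per_atom(with_calc)
```

In this sketch, asking for the default `"energy"` label on an object without a calculator raises, which is the situation the question above is about.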

YuanbinLiu · 2025-04-10T20:32:52Z

> Hi @YuanbinLiu , seems fine to me now, just wondering if we still need this method now ? https://github.com/YuanbinLiu/autoplex_pub/blob/a8bddf095e72ba300f7046ab347218feae6a5f20/src/autoplex/fitting/common/regularization.py#L529
>
> The newly added one should basically work for any case. Maybe we can remove it, as we will not use it?
>
> Also tagging @JaGeo , if you have any thoughts on it as well.

Good suggestion. They have now been safely removed.

JaGeo · 2025-04-11T11:54:19Z

Thanks!

Regularization in higher dimensions

9ef6bfe

YuanbinLiu requested review from JaGeo, MorrowChem, naik-aakash and nfragapane March 7, 2025 17:46

YuanbinLiu mentioned this pull request Mar 7, 2025

Issue with volume-stoichiometry regularisation scheme #337

Closed

Minor adjustment

92e6ece

JaGeo reviewed Mar 7, 2025

naik-aakash reviewed Mar 8, 2025

Comment thread tests/fitting/common/test_regularization.py Outdated

naik-aakash reviewed Mar 8, 2025

Comment thread tests/fitting/common/test_regularization.py Outdated

naik-aakash reviewed Mar 8, 2025

Comment thread tests/fitting/common/test_regularization.py

naik-aakash added 3 commits March 8, 2025 20:25

Update tests/fitting/common/test_regularization.py

8021db8

Update tests/fitting/common/test_regularization.py

2f16a1e

Update tests/fitting/common/test_regularization.py

76c29c6

JaGeo reviewed Mar 8, 2025

naik-aakash reviewed Mar 8, 2025

Comment thread tests/fitting/common/test_regularization.py Outdated

Fix condition check

bdb0ca5

YuanbinLiu added 2 commits March 10, 2025 17:41

Unit test update

0240b63

Unit test update

2fd91e4

QuantumChemist reviewed Mar 12, 2025

check package list

618cd06

naik-aakash reviewed Mar 12, 2025

Comment thread .github/workflows/python-package.yml Outdated

YuanbinLiu and others added 5 commits March 12, 2025 15:28

Revert

d195c19

fix a bug

a43624d

pre-commit auto-fixes

c3b1308

fix a lint issue

6649fac

Merge branch 'debug' of github.com:YuanbinLiu/autoplex_pub into debug

a8bddf0

JaGeo reviewed Apr 10, 2025

Remove duplicate functions

9e309a7

naik-aakash reviewed Apr 10, 2025

Minor adjustment

8692c35

JaGeo merged commit 512436e into autoatml:main Apr 11, 2025
17 checks passed

naik-aakash added the bug Something isn't working label Apr 27, 2025
